An energy-based model to optimize cluster visualization
National audience. Graphs are mathematical structures that provide a natural means of representing complex data. They capture structure and thus help model a wide range of complex real-life data in various domains. Graphs are also especially suitable for information visualization, because the intuitive visual abstraction they provide (dots and lines) is intimately associated with them. Visualization paves the way to interactive exploratory data analysis and to important goals such as identifying groups and subgroups in the data and understanding how these groups interact with each other. In this paper, we present a graph drawing approach that helps to better appreciate the cluster structure in data and the interactions that may exist between clusters. We assume that the clusters have already been extracted and focus on the visualization aspects. We propose an energy-based model for graph drawing that produces an aesthetic drawing in which each cluster occupies a separate zone within the visualization layout. This method emphasizes the inter-group interactions while still showing the inter-node interactions. The drawing areas assigned to the clusters can be user-specified (prefixed areas) or automatically crafted (free areas). The approach also handles geographically-based clustering. In the case of free areas, we illustrate the use of our drawing method through an example. In the case of prefixed areas, we first use an example from citation networks and then use another example to compare the results of our method with those of the divide-and-conquer approach. In the latter case, we show that while both methods successfully point out the cluster structure, our method better visualizes the global structure.
Exploring Semi-supervised Hierarchical Stacked Encoder for Legal Judgement Prediction
Predicting the judgment of a legal case from its unannotated case facts is a
challenging task. The lengthy and non-uniform document structure poses an even
greater challenge in extracting information for decision prediction. In this
work, we explore and propose a two-level classification mechanism, combining
supervised and unsupervised learning. A domain-specific pre-trained BERT
extracts sentence embeddings from long documents; these are further processed
by a transformer encoder layer, and unsupervised clustering extracts hidden
labels from the embeddings to better predict the judgment of a legal case. We
conduct several experiments with this mechanism and see higher
performance gains than the previously proposed methods on the ILDC dataset. Our
experimental results also show the importance of domain-specific pre-training
of Transformer Encoders in legal information processing.
Comment: Published in the 1st International Workshop on Legal Information
Retrieval at ECIR 2023, April 2nd 2023, Dublin, Ireland
(https://tmr.liacs.nl/legalIR/).
Recherche et représentation de communautés dans des grands graphes (Detecting and representing communities in large graphs)
15 pages. National audience. This paper deals with the analysis and visualization of large graphs. Our interest in this subject stems from the fact that graphs are convenient and widespread data structures. Indeed, this type of data is encountered in a growing number of concrete problems: the Web, information retrieval, social networks, biological interaction networks... Furthermore, these graphs grow ever larger as the means for gathering and storing data improve. This calls for new methods in graph analysis and visualization, which are now important and dynamic research fields at the interface of many disciplines such as mathematics, statistics, computer science and sociology. In this paper, we propose a method for graph representation and visualization based on a prior clustering of the vertices. Newman and Girvan (2004) point out that "reducing [the] level of complexity [of a network] to one that can be interpreted readily by the human eye, will be invaluable in helping us to understand the large-scale structure of these new network data": we rely on this assumption and use a clustering of the vertices as a preliminary step for simplifying the representation of the graph as a whole. The clustering phase consists in optimizing a quality measure specifically suited to the search for dense groups in graphs. This quality measure is the modularity, which expresses the "distance" to a null model in which the graph edges do not depend on the clustering. The modularity has shown its relevance in solving the problem of uncovering dense groups in a graph. The modularity is optimized with a stochastic simulated annealing algorithm. The visualization/representation phase is based on a force-directed algorithm described in Truong et al. (2007).
After a short introduction to the problem and a detailed description of the vertex clustering and representation algorithms, the paper introduces and discusses two applications from the social network field.
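The modularity mentioned in this abstract can be illustrated with a short sketch. It assumes the standard Newman-Girvan definition for an undirected, unweighted graph, Q = sum over clusters c of [l_c / m - (d_c / 2m)^2], where m is the number of edges, l_c the number of edges inside cluster c and d_c the total degree of its vertices. The function name `modularity` and the input format are hypothetical choices for illustration, and the simulated annealing search used by the authors is not reproduced here.

```python
from collections import defaultdict


def modularity(edges, community):
    """Newman-Girvan modularity Q of an undirected, unweighted graph.

    edges: iterable of (u, v) pairs, each undirected edge listed once.
    community: dict mapping every vertex to a cluster label.
    """
    edges = list(edges)
    m = len(edges)
    if m == 0:
        return 0.0
    degree_sum = defaultdict(int)  # total degree of the vertices of each cluster
    internal = defaultdict(int)    # number of edges with both ends in the cluster
    for u, v in edges:
        degree_sum[community[u]] += 1
        degree_sum[community[v]] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1
    # Observed share of internal edges minus the share expected under the null model.
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2 for c in degree_sum)


# Two triangles joined by a single edge, split into their two natural groups.
triangles = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
groups = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
q_split = modularity(triangles, groups)
q_single = modularity(triangles, {v: "all" for v in range(6)})
```

A clustering that puts every vertex in one group scores 0 under this definition, so a positive value such as `q_split` indicates more internal edges than the null model would predict.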
Approaches through which anticipation informs climate governance in South Asia
This report presents the RE-IMAGINE research in one of its four regions: South Asia. RE-IMAGINE builds on climate foresight expertise of the Climate Change, Agriculture and Food Security (CCAFS) Program and analyses the role of foresight in climate governance across the globe. Scenarios and many other methods and tools are used today to imagine climate futures and develop strategies for realizing new futures while governing climate change. With the proliferation of these processes in sustainability-related research and planning contexts, scrutiny of their role in steering climate actions in the present becomes increasingly important. How can the benefits and challenges of these processes of anticipation be better understood as governance interventions? At the same time, research into anticipatory climate governance processes in the Global South has remained very limited, while these regions are most vulnerable to climate change. The RE-IMAGINE report therefore examines processes of anticipation in four regions of the Global South. The research question we answer in this report is: "through what approaches are diverse processes of anticipation used to govern climate change in diverse South Asian contexts?". In order to answer this question, we first examine what methods and tools are used to anticipate climate futures and their role in climate policy and decision-making. We then closely examine three case studies to understand their approaches to anticipatory governance. Additionally, we present the results of two regional meetings with stakeholders where we discussed the challenges that exist in each country to practice anticipatory climate governance and the opportunities to strengthen capacities in this field. Finally, we present recommendations for strengthening processes of anticipatory climate governance in the region.
ViAGraph : a Tool for Graph Visualization and Analysis
Graphs are common representations that capture structure and can therefore model a wide range of data and knowledge. In this paper, we present and discuss the functionalities of ViAGraph, a tool for graph visualization and analysis. ViAGraph is meant to assist the user in exploring raw information in order to unveil interesting and useful information through both query/answer and interactively guided data examination. The paper presents several ideas and techniques related to graph visualization and exploration. Our main contributions are: 1. We propose a new approach to node placement based on "geographic" constraints. 2. We discuss a novel analysis method based on graph comparison. Strengths and weaknesses of the proposed methods are discussed.
Visualisation graphique basée sur un modèle d'énergie pour la représentation d'un graphe de citations (Energy-model-based graph visualization for representing a citation graph)
International audience. We propose a new energy-based model for drawing graphs. Compared with existing models, ours goes beyond visualizing the inherent structure of the graph and makes it easier to see structures that stem from particular properties of the graph. Our model relies on an energy model and extends the model of Fruchterman and Reingold (1991). The algorithm we suggest introduces "invisible" links between nodes, which are used when positioning the graphical components.
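As a rough illustration of how invisible links can steer a force-directed layout, the sketch below performs one iteration in the style of Fruchterman and Reingold, with the usual repulsion of magnitude k^2/d between every pair of vertices and attraction of magnitude d^2/k along links. The invisible links attract exactly like visible edges but are never drawn. The function name `layout_step`, its parameters, and the way invisible links are supplied are assumptions made for this example; the abstract does not say how the authors choose or weight these links.

```python
import math


def layout_step(pos, visible_edges, invisible_edges, k=1.0, temperature=0.1):
    """One Fruchterman-Reingold-style iteration.

    pos: dict vertex -> (x, y). Returns a new dict; pos is left unchanged.
    invisible_edges attract like visible edges but are not meant to be drawn.
    """
    nodes = list(pos)
    disp = {v: [0.0, 0.0] for v in nodes}

    # Repulsion between every pair of vertices: magnitude k^2 / d.
    for i, u in enumerate(nodes):
        for v in nodes[i + 1:]:
            dx = pos[u][0] - pos[v][0]
            dy = pos[u][1] - pos[v][1]
            d = math.hypot(dx, dy) or 1e-9
            f = k * k / d
            disp[u][0] += dx / d * f
            disp[u][1] += dy / d * f
            disp[v][0] -= dx / d * f
            disp[v][1] -= dy / d * f

    # Attraction along visible and invisible links: magnitude d^2 / k.
    for u, v in list(visible_edges) + list(invisible_edges):
        dx = pos[u][0] - pos[v][0]
        dy = pos[u][1] - pos[v][1]
        d = math.hypot(dx, dy) or 1e-9
        f = d * d / k
        disp[u][0] -= dx / d * f
        disp[u][1] -= dy / d * f
        disp[v][0] += dx / d * f
        disp[v][1] += dy / d * f

    # Move each vertex along its displacement, capped by the temperature.
    new_pos = {}
    for v in nodes:
        length = math.hypot(disp[v][0], disp[v][1]) or 1e-9
        step = min(length, temperature)
        new_pos[v] = (pos[v][0] + disp[v][0] / length * step,
                      pos[v][1] + disp[v][1] / length * step)
    return new_pos


# Two unconnected vertices: an invisible link pulls them together,
# whereas without it they only repel each other.
start = {"a": (0.0, 0.0), "b": (3.0, 0.0)}
with_link = layout_step(start, [], [("a", "b")])
without_link = layout_step(start, [], [])
gap_with = with_link["b"][0] - with_link["a"][0]
gap_without = without_link["b"][0] - without_link["a"][0]
```

In a full layout this step would be repeated while the temperature is lowered, and only the visible edges would be rendered.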
TREC NOVELTY TRACK AT IRIT - SIG
relevant if it matches the topic with a certain level of coverage. This coverage depends on the category of terms used in the texts. Different types of terms have been defined: highly relevant, scarcely relevant, non-relevant and highly non-relevant. With regard to the novelty part, a sentence is considered novel if its similarity with previously processed sentences and with the n-best-matching sentences does not exceed certain thresholds.
Energy model for clustered-graph visualization
FAIR: Factor Analysis of Information Risk. National audience
Mining Information in Order to Extract Hidden and Strategic Information
The amount of information available throughout the Internet or through specific collections is so huge that increasingly sophisticated information handling systems are necessary to exploit it. In addition to efficient retrieval engines, users need tools to analyse the relevant information without having to read all of it. The main objective of knowledge discovery systems is to turn selected pieces of raw information into knowledge or generalized patterns. Such a process raises many problems across the three knowledge discovery phases: information harvesting and selection, information mining, and results display. In this paper we present an interactive method to achieve knowledge discovery. It is based on information harvesting, homogenization and filtering. The discovery process itself is achieved by making several modules cooperate: different mining functions and visualization modules interact dynamically, directed by the user. The operational software ..
NOVELTY TRACK AT IRIT - SIG
IRIT developed a new strategy for detecting relevant sentences, one that we had not tried in the more general context of document retrieval but had previously, and partially, tried in document categorization. In our approach a sentence is considered relevant if it matches the topic with a certain level of coverage. This level of coverage depends on the category of the terms used in the texts. Three types of terms have been defined: highly relevant, lowly relevant and non-relevant. With regard to the novelty part, a sentence is considered novel when its levels of coverage with the previously processed sentences and with the best-matching sentences do not exceed certain thresholds.
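The coverage and threshold tests described in this abstract can be sketched as follows. Everything specific here is an assumption made for illustration: the term weights per category, the set-overlap measure, the threshold values and the names `coverage`, `overlap` and `select_novel` are hypothetical, since the abstract does not give the actual formulas used by IRIT.

```python
def coverage(sentence_terms, weighted_terms):
    """Share of the total topic term weight matched by the sentence.

    weighted_terms: dict term -> weight, e.g. 1.0 for highly relevant terms
    and 0.5 for lowly relevant ones (hypothetical values).
    """
    total = sum(weighted_terms.values())
    if total == 0:
        return 0.0
    matched = sum(w for t, w in weighted_terms.items() if t in sentence_terms)
    return matched / total


def overlap(terms, other):
    """Fraction of `terms` that also occur in `other`."""
    return len(terms & other) / len(terms) if terms else 0.0


def select_novel(sentences, t_all=0.8, t_best=0.6, n_best=1):
    """Return the indices of sentences judged novel, processed in order.

    sentences: list of sets of terms. A sentence is kept when its overlap
    with all earlier sentences taken together stays within t_all and its
    mean overlap with the n_best closest earlier sentences stays within t_best.
    """
    novel, seen = [], []
    for i, terms in enumerate(sentences):
        union = set().union(*seen) if seen else set()
        best = sorted((overlap(terms, s) for s in seen), reverse=True)[:n_best]
        best_score = sum(best) / len(best) if best else 0.0
        if overlap(terms, union) <= t_all and best_score <= t_best:
            novel.append(i)
        seen.append(terms)
    return novel


topic = {"flood": 1.0, "dam": 1.0, "weather": 0.5}
relevance = coverage({"flood", "weather", "today"}, topic)
picked = select_novel([
    {"flood", "dam", "river"},
    {"flood", "dam", "river"},   # repeats the first sentence
    {"drought", "crop"},
])
```

In this toy run the repeated sentence is rejected and the other two are kept. A relevance filter would typically be applied first, so that only sentences whose `coverage` exceeds some cut-off reach the novelty step.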